Lecture 5 - Spatial Analysis

Definition: Spatial Analysis: The access, measurement, transformation, and modeling of spatial information.

OUTLINE

I. Access to Spatial Data
    A. Attribute-based operations
    B. Spatial-based operations
II. Measurement
    A. Shape
    B. Distance
III. Neighborhood Operations
    A. Roving Window
    B. Trend Surface Analysis
    C. Buffers

I. Access to Spatial Data:

Spatial data can be accessed through queries on either spatial or attribute data.

A. Attribute-Based Operations

Attributes are often the final result of a GIS analysis (Chrisman, 1997). A timber manager may want to know the forest yield in each of the next seven years, an ecologist may be interested in the amount of suitable habitat for a particular species in a region, and a politician may want to know the demographic composition of a newly defined Congressional District. The attributes that represent the final result are usually derived from the interaction of spatial and attribute information.

1.0 Reducing information content

1.1 Generalize or Group: Attributes can be generalized by merging similar sub-classes (e.g. a detailed landcover classification that includes many types of urban classes can be grouped into a single urban class). The result of this generalization is a merging of polygons in the spatial domain by dissolving the boundaries between polygons that now share a class.

1.2 Selection: Select a set of entities based on some combination of their attribute data (e.g. select all soils that are poorly drained on 15% slopes). Structured Query Language (SQL) is the standard database query language for this purpose (e.g. select roads where type = "highway" and material = "concrete"). Boolean logic is used for complex selection from an entity with multiple attributes.

Boolean logic for retrieval

operators: AND, OR, XOR, NOT

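characteristics: AND, OR and XOR are each commutative (A AND B = B AND A), but AND NOT is not (A AND NOT B differs from B AND NOT A). When operators are mixed, the order of evaluation matters, so parentheses are needed: (A AND B) OR C differs from A AND (B OR C).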

examples:
A is all units with clay loam topsoil

B is all units with topsoil pH > 7.0

C is all units with poorly drained soils

X = A AND B

X = A OR B

X = A XOR B

X = A AND NOT B

X = (A AND B) OR C

X = A OR (B AND C)
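
The retrieval examples above can be tried out with Python sets, where `&`, `|`, `^` and `-` play the roles of AND, OR, XOR and AND NOT. This is only a sketch: the five soil units and their attribute values are hypothetical, invented for illustration.

```python
# Hypothetical soil units: id -> (topsoil texture, topsoil pH, drainage class).
units = {
    1: ("clay loam", 7.4, "poor"),
    2: ("clay loam", 6.5, "good"),
    3: ("sandy loam", 7.8, "poor"),
    4: ("silt loam", 6.0, "poor"),
    5: ("sandy loam", 6.2, "good"),
}

# The three selections from the notes, each held as a set of unit ids.
A = {u for u, (texture, ph, drain) in units.items() if texture == "clay loam"}
B = {u for u, (texture, ph, drain) in units.items() if ph > 7.0}
C = {u for u, (texture, ph, drain) in units.items() if drain == "poor"}

a_and_b = A & B          # X = A AND B
a_or_b = A | B           # X = A OR B
a_xor_b = A ^ B          # X = A XOR B
a_and_not_b = A - B      # X = A AND NOT B
ab_or_c = (A & B) | C    # X = (A AND B) OR C
a_or_bc = A | (B & C)    # X = A OR (B AND C)

print(ab_or_c, a_or_bc)  # the two parenthesized queries select different units
```

The last two results differ, which shows why the placement of parentheses matters in a mixed query.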

1.3 Classification: Classification is a procedure which creates ordinal or nominal categories from ratio or interval data (Chrisman, 1997). Category break points are determined by the criteria set forth by the person or organization classifying the entity (e.g. criteria for low, medium or high income classes). Commonly used generic classification schemes are equal interval systems (classes of equal width across the data range) and quantile systems (classes holding roughly equal numbers of observations).
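
A minimal sketch of the two generic schemes, assuming a short list of hypothetical income values (in thousands) and three classes. The function names and the simple rank-based quantile rule are illustrative choices, not a standard taken from the notes.

```python
from bisect import bisect_right

def equal_interval_breaks(values, k):
    """Break points that split the data range into k classes of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]

def quantile_breaks(values, k):
    """Break points that put roughly the same number of values in each class."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // k] for i in range(1, k)]

def classify(values, breaks):
    """Ordinal class (0 = lowest) for each value: count the breaks it reaches."""
    return [bisect_right(breaks, v) for v in values]

# Hypothetical household incomes, in thousands.
incomes = [12, 18, 25, 31, 40, 55, 72, 90, 150]

ei_breaks = equal_interval_breaks(incomes, 3)
q_breaks = quantile_breaks(incomes, 3)
ei_classes = classify(incomes, ei_breaks)
q_classes = classify(incomes, q_breaks)

print(ei_breaks, ei_classes)
print(q_breaks, q_classes)
```

With skewed data such as incomes, equal interval classes tend to crowd most observations into the lowest class, while quantile classes stay evenly populated.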

2.0 Increasing information content

Rank: convert categorical data to an ordinal ranking (e.g. forest cover -> habitat quality)

Evaluate: convert categorical data to interval or ratio data, using the categories as indirect measures of another phenomenon (e.g. forest cover -> species density estimation).

Combining Pairs of Input Values:

Operations for interval or higher levels: sum and difference (e.g. total population is the sum of each group).

Operations on ratio measurements: proportions (e.g. low income to total population), density (e.g. number of people per unit area) and rate.

B. Spatial-Based Operations

Point, Line or Area within a Polygon: Select a point, line or area that is contained within a user-defined polygon.

Line or Area Intersected by a Polygon: Select a line or area that intersects a user-defined polygon.
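
For the point case, one common way to test containment is ray casting: count how many polygon edges a horizontal ray from the point crosses, and treat an odd count as "inside". The sketch below assumes a simple polygon given as a list of (x, y) vertices; the polygon and the well locations are hypothetical, and points lying exactly on the boundary are not handled specially.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) falls inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside

# Hypothetical user-defined polygon and point features.
study_area = [(0, 0), (10, 0), (10, 10), (0, 10)]
wells = {"w1": (5, 5), "w2": (15, 5), "w3": (1, 9)}

selected = [name for name, (x, y) in wells.items()
            if point_in_polygon(x, y, study_area)]
print(selected)
```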

II. Measurement

A. Shape Measures

1.0 Perimeter/Area Ratio: Provides a measure of the compactness of the polygon. A more complex measure, the Convexity Index, uses a circle as the comparative shape and provides a measure of compactness or edginess.

The index is: CI = kP/A

where: CI = convexity index

k = a constant based on the size of the circle that would inscribe the polygon

P = perimeter

A = area
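
As a rough illustration, the perimeter and area of a polygon can be computed from its vertices (the area by the shoelace formula) and combined as kP/A. The notes do not give a value for k, so the sketch leaves it as a parameter with a placeholder default of 1.0, which reduces the index to the plain perimeter/area ratio. The two test shapes are hypothetical.

```python
import math

def perimeter_and_area(polygon):
    """Perimeter and (shoelace) area of a simple polygon given as (x, y) vertices."""
    n = len(polygon)
    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        perimeter += math.hypot(x2 - x1, y2 - y1)
        twice_area += x1 * y2 - x2 * y1
    return perimeter, abs(twice_area) / 2.0

def convexity_index(polygon, k=1.0):
    """CI = kP/A; k = 1.0 is a placeholder, not a value taken from the notes."""
    p, a = perimeter_and_area(polygon)
    return k * p / a

# Two hypothetical polygons with the same area (16) but different shapes.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
strip = [(0, 0), (16, 0), (16, 1), (0, 1)]

print(convexity_index(square), convexity_index(strip))
```

The elongated strip gives a higher value than the square of equal area, reflecting its lower compactness.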

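2.0 Euler Function: A measure of the fragmentation or spatial integrity of an area (i.e. is it uniform in class type or is it broken up by numerous inclusions of different classes). Euler number = (# holes) - (# fragments - 1). For example, an area made up of two fragments that contain three holes in total has an Euler number of 3 - (2 - 1) = 2.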

B. Distance Measures: Distance can be a simple Euclidean measurement, or it can be modified to reflect the difficulty of moving over a surface that may be rough and have various barriers.

1.0 Euclidean distance: the simple straight-line distance, d_{ij} = [(X_i - X_j)^2 + (Y_i - Y_j)^2]^(1/2). When measured in all directions from a central point, it creates an isotropic surface.
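
A short sketch of the distance formula, and of the surface obtained by measuring from a central point to every cell of a small grid. The grid size and center are arbitrary choices for illustration.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between points p = (x, y) and q = (x, y)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def distance_surface(size, center):
    """size x size grid of distances (in cell units) from the center cell."""
    cx, cy = center
    return [[math.hypot(x - cx, y - cy) for x in range(size)]
            for y in range(size)]

d = euclidean_distance((0, 0), (3, 4))
surface = distance_surface(5, (2, 2))

for row in surface:
    print(" ".join(f"{v:4.2f}" for v in row))
```

Cells at the same number of steps north, south, east or west of the center hold the same value, which is what makes the surface the same in every direction.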

2.0 Functional Distance: friction, relative barriers and absolute barriers create a non-isotropic surface

friction: some type of impediment to movement or flow, broad in scope (i.e. terrain)

relative barrier: similar to friction but occurring in a specific area

absolute barrier: an impasse that must be circumvented

Each of these can be assigned impedance values depending upon how difficult they are to travel through.

3.0 Least Cost Path: the shortest functional distance between two points in a coverage.

Steps:

a. Create a cost surface that gives the cost of passing through each cell. This can be a weighted combination of several data layers.

b. Select an origin.

c. Set all cells to very high values (except the origin, which starts at 0). Move iteratively from cell to cell and accumulate the cost of moving to the next cell. If the value in the cell is higher than the cost calculated, replace it with the new lower cost. That same cell will be revisited through other paths and may be replaced by lower cost solutions.

d. Select a destination.

e. Calculate the least cost path between origin and destination.

Accumulated distance for cell (x-1, y) = 0.5 * (Dist * Cost_{x,y}) + 0.5 * (Dist * Cost_{x-1,y}) + accumulated distance for cell (x, y).

Dist = 1 for adjacent cells and 1.414 for diagonal cells.

Cost_{x,y} = cost of the cell coming from.

Cost_{x-1,y} = cost of the cell going to.
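
The steps and the accumulation formula above can be sketched as follows. Note the swapped-in technique: instead of sweeping the grid repeatedly, this sketch uses a priority queue (Dijkstra's algorithm), which reaches the same accumulated values while visiting the cheapest cell first. The cost grid, the function names, and the use of `None` to mark an absolute barrier are all assumptions made for illustration.

```python
import heapq
import math

def accumulate_cost(cost, origin):
    """Accumulated-cost surface from origin; cost[r][c] is None for a barrier."""
    rows, cols = len(cost), len(cost[0])
    acc = [[math.inf] * cols for _ in range(rows)]  # step c: start very high
    back = {}                                       # cell -> cell it was reached from
    acc[origin[0]][origin[1]] = 0.0
    heap = [(0.0, origin)]
    while heap:
        total, (r, c) = heapq.heappop(heap)
        if total > acc[r][c]:
            continue  # stale queue entry; a cheaper route was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr == 0 and dc == 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                if cost[nr][nc] is None:
                    continue  # absolute barrier: must be circumvented
                dist = 1.414 if dr and dc else 1.0
                # Half the step is charged at the "from" cell, half at the "to" cell.
                new_total = total + 0.5 * dist * cost[r][c] + 0.5 * dist * cost[nr][nc]
                if new_total < acc[nr][nc]:
                    acc[nr][nc] = new_total
                    back[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (new_total, (nr, nc)))
    return acc, back

def trace_path(back, origin, destination):
    """Step e: follow the back-links from destination to origin."""
    path = [destination]
    while path[-1] != origin:
        path.append(back[path[-1]])
    return path[::-1]

# Hypothetical cost surface: a high-friction patch (9) between origin and destination.
cost = [
    [1, 1, 1],
    [9, 9, 1],
    [1, 1, 1],
]
origin, destination = (0, 0), (2, 0)
acc, back = accumulate_cost(cost, origin)
path = trace_path(back, origin, destination)
print(round(acc[2][0], 3), path)
```

In this grid the straight route through the high-friction cell accumulates a cost of 10, so the least cost path detours around it even though the detour is longer in Euclidean terms.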

III. Neighborhood Operations:

A neighborhood operation is a local operation that involves a target area (a cell in a raster) and its neighboring areas; the result is a new data coverage that is a function of the target area and the neighboring areas.

A. Roving Windows or Filters: The convolution of an n x n kernel with an image to generate a new coverage. Convolution involves multiplying each kernel cell by the corresponding cell in the image and summing the products; the sum provides the value of the target cell in the new coverage. This process is especially useful for highlighting rapid changes in elevation in an elevation model or for smoothing out a model that may have spurious elevations.
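
A minimal sketch of a roving window, assuming a 3 x 3 mean (smoothing) kernel and a small hypothetical elevation grid with one spurious value. The kernel is applied without flipping, which gives the same result as a true convolution for a symmetric kernel like this one. Edge cells, where the window would fall off the grid, are left as `None` here; other edge treatments are possible.

```python
def convolve(grid, kernel):
    """Slide an n x n kernel over grid; each output is the sum of products."""
    rows, cols = len(grid), len(grid[0])
    k = len(kernel)
    half = k // 2
    out = [[None] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    total += kernel[i][j] * grid[r + i - half][c + j - half]
            out[r][c] = total
    return out

# Hypothetical elevation model with one spurious elevation (55).
elev = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 55, 10],
    [10, 10, 10, 10],
]
mean_kernel = [[1 / 9] * 3 for _ in range(3)]

smoothed = convolve(elev, mean_kernel)
print(smoothed[2][2])
```

The spike of 55 is pulled down to about 15, the mean of its window. A kernel with positive and negative weights would instead emphasize rapid changes in elevation.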

B. Trend Surface Analysis: The analysis of the change in surface direction and slope.

1.0 Slope (Gradient): a measure of the change in elevation over the cell.

tan G = [(dZ/dX)^2 + (dZ/dY)^2]^0.5

2.0 Aspect: the direction of the slope.

tan A' = (dZ/dY) / (dZ/dX)

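Aspect = X, with A' taken as a positive angle and the case chosen by the signs of the two derivatives:

where X = 90 - A' if dZ/dX < 0 and dZ/dY < 0

X = 90 + A' if dZ/dX < 0 and dZ/dY > 0

X = 270 + A' if dZ/dX > 0 and dZ/dY < 0

X = 270 - A' if dZ/dX > 0 and dZ/dY > 0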

where: [dZ/dX]_{i,j} = [(Z_{i+1, j+1} + 2Z_{i+1, j} + Z_{i+1, j-1}) - (Z_{i-1, j+1} + 2Z_{i-1, j} + Z_{i-1, j-1})] / (8 dx)

[dZ/dY]_{i,j} = [(Z_{i+1, j+1} + 2Z_{i, j+1} + Z_{i-1, j+1}) - (Z_{i+1, j-1} + 2Z_{i, j-1} + Z_{i-1, j-1})] / (8 dy)

dx = distance between cell centroids in the x direction

dy = distance between cell centroids in the y direction

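Kernel for slope and aspect calculations (3 x 3 window of elevations around cell i,j):

    Z_{i-1, j+1}    Z_{i, j+1}    Z_{i+1, j+1}
    Z_{i-1, j}      Z_{i, j}      Z_{i+1, j}
    Z_{i-1, j-1}    Z_{i, j-1}    Z_{i+1, j-1}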

Example (dx = dy = 10)

[dZ/dX]_{i,j} = [(100 + 2*95 + 90) - (85 + 2*80 + 70)] / (8 * 10) = .8125

[dZ/dY]_{i,j} = [(100 + 2*90 + 85) - (90 + 2*80 + 70)] / (8 * 10) = .5625

1) Gradient (G)

tan G = (.8125^2 + .5625^2)^0.5 = .988

G = 44.65 degrees

2) Aspect (A)

A' = arctan(.5625/.8125) = 34.7 degrees

Aspect = 270 - 34.7 = 235.3 degrees (both dZ/dX and dZ/dY are positive)
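
The worked example can be checked with a short script. This is a sketch under stated assumptions: the 3 x 3 window is laid out with the j+1 row first and the i-1 column on the left, the elevations are read back from the terms of the example, the center value is not used by the formulas and is left as `None`, and A' is taken as a positive angle before the quadrant rule is applied.

```python
import math

def slope_aspect(window, dx, dy):
    """Return (dZ/dX, dZ/dY, gradient in degrees, aspect in degrees or None)."""
    # Row 0 is the j+1 row, row 2 is the j-1 row; column 0 is i-1, column 2 is i+1.
    (nw, n, ne), (w, _, e), (sw, s, se) = window
    dz_dx = ((ne + 2 * e + se) - (nw + 2 * w + sw)) / (8 * dx)
    dz_dy = ((ne + 2 * n + nw) - (se + 2 * s + sw)) / (8 * dy)
    gradient = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return dz_dx, dz_dy, gradient, None  # flat cell: aspect is undefined
    a = math.degrees(math.atan2(abs(dz_dy), abs(dz_dx)))  # A' as a positive angle
    if dz_dx < 0:
        aspect = 90 - a if dz_dy < 0 else 90 + a
    else:
        aspect = 270 + a if dz_dy < 0 else 270 - a
    return dz_dx, dz_dy, gradient, aspect % 360

# Elevations taken from the terms of the worked example (cell size 10).
window = [
    [85, 90, 100],
    [80, None, 95],
    [70, 80, 90],
]
dz_dx, dz_dy, gradient, aspect = slope_aspect(window, dx=10, dy=10)
print(dz_dx, dz_dy, round(gradient, 2), round(aspect, 1))
```

The script reproduces the two derivatives and the aspect of the example. The gradient can differ from the hand calculation in the second decimal place because the hand calculation rounds tan G before taking the arctangent.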

C. Buffering: A buffer is a zone of specified distance measured from a point, a line or an area. Buffers are often used in situations where protection of a resource is needed (buffering streams to prevent timber cutting), in finding the existence of certain entities with respect to other entities (TRI sites within 0.5 miles of schools), or in demarcating a zone (100-year flood zone).

Raster vs Vector Buffering: Raster buffering provides a unique distance value for each cell. Vector buffering is a discrete process that is usually done for several distances and results in ranges of objects included within and between buffers.
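
The contrast can be sketched on a small grid: the raster approach stores a distance in every cell, and discrete buffer zones like those of the vector approach can then be derived by grouping those distances into ranges. The stream cells, the 100-unit cell size and the buffer distances below are hypothetical, and the brute-force nearest-source search is only suitable for small grids.

```python
import math

def raster_distance(rows, cols, sources, cell_size=1.0):
    """Distance from every cell to its nearest source cell (brute force)."""
    return [[min(math.hypot(r - sr, c - sc) for sr, sc in sources) * cell_size
             for c in range(cols)]
            for r in range(rows)]

def buffer_zone(distance, breaks):
    """Index of the first buffer ring that contains distance, or None if outside."""
    for i, limit in enumerate(breaks):
        if distance <= limit:
            return i
    return None

# Hypothetical stream running down the first column of a 3 x 4 grid.
stream = [(0, 0), (1, 0), (2, 0)]
dist = raster_distance(3, 4, stream, cell_size=100.0)

# Discrete rings: 0 = within 100 units, 1 = between 100 and 200 units.
zones = [[buffer_zone(d, [100.0, 200.0]) for d in row] for row in dist]

print(dist[0])
print(zones[0])
```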